1. INTRODUCTION

One of Shannon's many fundamental contributions was his result
that source coding and channel coding can be treated separately
without any loss of performance for the overall system [1]. The basic
design procedure is to select a source encoder which changes the source
sequence into a series of independent, equally likely binary digits,
followed by a channel encoder which accepts binary digits and puts them
into a form suitable for reliable transmission over the channel. However,
the separation argument no longer holds if either of the following two
situations occur:

i. The input to the source decoder is different from the 
output of the source encoder, which happens when the 
link between the source encoder and source decoder is 
no longer error free, or 

ii. when the source encoder output contains redundancy. 

Of course, case (i) occurs when the channel coder does not achieve
zero error probability and case (ii) occurs when the source encoder is
suboptimal. These two situations are common occurrences in practical
systems where source or channel models are imperfectly known,
complexity is a serious issue, or significant delay is not tolerable.
Various approaches have been developed for such situations. They are
usually grouped under the general heading of joint source/channel coding.

Most of the various joint source/channel coding approaches can
be classified in two main categories: (A) approaches which entail the
modification of the source coder/decoder structure to reduce the effect
of channel errors, and (B) approaches which examine the distribution of
bits between the source and channel coders. The first set of approaches
can be divided still further into two classes. One class of approaches
examines the modification of the overall structure, while the other deals
with the modification of the decoding procedure to take advantage of
the redundancy in the output of the source coder.

To the first class belongs the work of Dunham and Gray [2], who
proved the existence of joint source/channel trellis coding systems for
certain fidelity criteria, and a design of a joint source/channel trellis
coder presented by Ayanoglu and Gray [3], where the design procedure
is the generalized Lloyd algorithm. Further, Massey [4] and Ancheta
[5] showed that for distortionless transmission of the source using linear
joint source/channel encoders, equivalent performance can be obtained
with a significant reduction in complexity. Chang and Donaldson [6]
propose modifications to the DPCM system to reduce the effect of
channel errors, while Kurtenbach and Wintz [7] and Farvardin and
Vaishampayan [8] study the problem of optimum quantizer design for
noisy channels. Goodman and Sundberg [9,10] propose an embedded
DPCM system which consists of a two bit DPCM and a two bit PCM
system in parallel.

In the second class of category A, we include the work of Reininger
and Gibson [11], who use the fact that coefficients in neighboring blocks
in a transform coding scheme will not vary greatly, and thus use
coefficients from neighboring blocks to correct a possible error, and the
work of Steele, Goodman and McGonegal [12,13], who propose a difference
detection and correction scheme for broadcast quality speech. In this
scheme the receiver infers an error whenever an individual sample to
sample difference is greater than the mean squared difference of a 21
sample sliding block. When an error is detected, the received sample
is replaced by the output of a smoothing circuit. Ngan and Steele [14]
use a similar method for recovering from errors in an image transmission
system. Sayood and Borkenhagen [15,16] use the redundancy at
the source coder output to perform sequence estimation.

The work of Modestino, Daut and Vickers [17] belongs to category
B. In their study of transform coding they examine tradeoffs between
allocating bits for source and channel coding. Comstock and Gibson
[18] extend this work and provide an explicit mechanism for allocating
bits between a source coder and a Hamming channel coder.
Additionally, Moore and Gibson [19] study the allocation of bits between a
DPCM coder and self orthogonal convolutional coding.

In this paper we present a maximum a posteriori probability (MAP)
approach to joint source/channel coder design, which belongs to
category A, and hence we explore a technique for designing joint
source/channel coders, rather than ways of distributing bits between
source coders and channel coders. We assume that the two nonideal
situations referred to earlier are present. Our approach is as follows.
For a nonideal source coder, we use MAP arguments to design a decoder
which takes advantage of redundancy in the source coder output to
perform error correction. Once the decoder is obtained, we analyze
it with the purpose of obtaining "desirable properties" of the channel
input sequence for improving overall system performance. We then
propose an encoder design which incorporates these properties.

2. THE MAP DESIGN CRITERION 

For a discrete memoryless channel (DMC), let the channel input
alphabet be denoted by $A = \{a_0, a_1, \ldots, a_{M-1}\}$ and the channel
input and output sequences by $Y = \{y_0, y_1, \ldots, y_{L-1}\}$ and
$\hat{Y} = \{\hat{y}_0, \hat{y}_1, \ldots, \hat{y}_{L-1}\}$, respectively.
If $\mathcal{A} = \{A_i\}$ is the set of sequences
$A_i = \{\alpha_{i,0}, \alpha_{i,1}, \ldots, \alpha_{i,L-1}\}$,
$\alpha_{i,k} \in A$, then the optimum receiver (in
the sense of maximizing the probability of making a correct decision)
maximizes $P[C]$, where

$$P[C] = \sum_{\hat{Y}} P[C \mid \hat{Y}]\, P[\hat{Y}].$$

This in turn implies that the optimum receiver maximizes $P[C \mid \hat{Y}]$.
When the receiver selects the output to be $A_k$, then
$P[C \mid \hat{Y}] = P[Y = A_k \mid \hat{Y}]$. Thus, the optimum receiver
selects the sequence $A_k$ such that

$$P[Y = A_k \mid \hat{Y}] \ge P[Y = A_i \mid \hat{Y}] \quad \forall i.$$

Lemma 1

Let $y_i$ be the input to a DMC. Given $y_{i-1}$, $y_i$ is conditionally
independent of $y_{i-k}$, $k > 1$. If $\hat{y}_0 = y_0$ then the optimum
receiver selects a sequence $A_i$ to maximize
$\prod_{i=1}^{L-1} P(y_i \mid y_{i-1}, \hat{y}_i)$.

Proof:

From the preceding result, the receiver tries to maximize $P(Y \mid \hat{Y})$.
Using the chain rule we can write this as

$$P[Y \mid \hat{Y}] = P(y_0, y_1, \ldots, y_{L-1} \mid \hat{y}_0, \hat{y}_1, \ldots, \hat{y}_{L-1})$$

$$= P(y_{L-1} \mid y_{L-2}, y_{L-3}, \ldots, y_0, \hat{y}_0, \ldots, \hat{y}_{L-1})\,
P(y_{L-2} \mid y_{L-3}, \ldots, y_0, \hat{y}_0, \ldots, \hat{y}_{L-1}) \cdots
P(y_0 \mid \hat{y}_0, \ldots, \hat{y}_{L-1}).$$

The last factor on the right hand side (RHS) is equal to one. Using
the assumption of the DMC, we obtain

$$P(Y \mid \hat{Y}) = \prod_{i=1}^{L-1} P(y_i \mid y_{i-1}, \hat{y}_i). \tag{1}$$

□

The lemma addresses the situation in case (ii), i.e., the situation
in which the source coder output (which is also the channel input
sequence) contains redundancy. Using this lemma, we can design a
decoder which will take advantage of dependence in the channel input
sequence.
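
As an illustration of Lemma 1, the following minimal sketch searches for
the sequence that maximizes the sum of log P(y_i | y_{i-1}, yhat_i) with a
Viterbi-style recursion. It is not taken from the paper: the function
name, the symbol-index representation, and the callable `log_post` (which
is assumed to return the log of the conditional probability in (1)) are
all hypothetical choices made for this example.

```python
def map_sequence_decode(received, first_symbol, M, log_post):
    """Return the index sequence maximizing sum_i log P(y_i | y_{i-1}, yhat_i).

    received     : received symbol indices yhat_1 ... yhat_{L-1}
    first_symbol : index of y_0, assumed known (yhat_0 = y_0 as in Lemma 1)
    M            : size of the channel input alphabet
    log_post     : hypothetical callable (j, m, n) ->
                   log P(y_i = a_j | y_{i-1} = a_m, yhat_i = a_n)
    """
    neg_inf = float("-inf")
    score = [neg_inf] * M
    score[first_symbol] = 0.0
    back_pointers = []
    for n in received:
        new_score = [neg_inf] * M
        pointers = [0] * M
        for j in range(M):          # candidate current symbol
            for m in range(M):      # candidate previous symbol
                if score[m] == neg_inf:
                    continue
                s = score[m] + log_post(j, m, n)
                if s > new_score[j]:
                    new_score[j] = s
                    pointers[j] = m
        back_pointers.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(range(M), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

The returned list has one entry per channel input symbol, including the
known first symbol. Because the metric is additive, the search cost grows
linearly with the sequence length and quadratically with M.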




3. DECODER DESIGN 

The lemma of the previous section provides the mathematical
structure for the decoder. The physical structure can be easily obtained
by examining the quantity to be maximized. The decoder maximizes
$P(Y \mid \hat{Y})$ or equivalently $\log P(Y \mid \hat{Y})$, but

$$\log P(Y \mid \hat{Y}) = \sum_i \log P(y_i \mid y_{i-1}, \hat{y}_i) \tag{2}$$

and various solutions exist for the maximization of additive path
metrics. To implement this decoder we need to be able to compute the
path metric. This task is considerably eased by the following lemma.

Lemma 2

Let $A$ be the channel input alphabet and $\{y_i\}$ and $\{\hat{y}_i\}$ be
the input and output sequences of a DMC. Then

$$P[y_i = a_j \mid y_{i-1} = a_m, \hat{y}_i = a_n] =
\frac{P[y_i = a_j \mid y_{i-1} = a_m]\, P[\hat{y}_i = a_n \mid y_i = a_j]}
{\sum_l P[y_i = a_l \mid y_{i-1} = a_m]\, P[\hat{y}_i = a_n \mid y_i = a_l]} \tag{3}$$

Proof: See [16].

The expression on the RHS of (3), while it looks more complicated,
is actually a much more tractable form of the desired conditional
probability. Note that this expression is a function of two distinct sets of
transition probabilities, the channel transition probabilities and the
source coder output transition probabilities. As the channel transition
probabilities depend only on the channel, and the source coder output
transition probabilities depend only on the source coder and source
probabilities, the two sets of transition probabilities can be estimated
independently. The two can then be combined according to (3) to
construct an M x M x M lookup table for use in decoding. If the source
coder or source changes, the only parameters to be modified are the
source coder output transition probabilities. Results using a DPCM
source coder with an image as the source are presented in Section 6.
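
The construction of the M x M x M table from (3) can be sketched as
follows. This is only an illustration under stated assumptions: the
channel is taken to be a binary symmetric channel with bit error
probability `p`, each symbol index is assumed to double as its fixed
length binary codeword, and all function and parameter names are
hypothetical rather than the authors' implementation.

```python
def hamming(a, b):
    """Hamming distance between two codewords given as non-negative ints."""
    return bin(a ^ b).count("1")


def bsc_symbol_prob(n, j, bits, p):
    """P[yhat_i = a_n | y_i = a_j] for a BSC (assumed channel model)."""
    d = hamming(n, j)
    return p ** d * (1.0 - p) ** (bits - d)


def build_map_table(g, bits, p):
    """table[j][m][n] = P[y_i = a_j | y_{i-1} = a_m, yhat_i = a_n] via (3).

    g[j][m] is the source coder output transition probability
    P[y_i = a_j | y_{i-1} = a_m], e.g. estimated from a training set.
    """
    M = len(g)
    table = [[[0.0] * M for _ in range(M)] for _ in range(M)]
    for m in range(M):
        for n in range(M):
            # Numerators of (3) for every candidate symbol a_l.
            weights = [g[l][m] * bsc_symbol_prob(n, l, bits, p) for l in range(M)]
            total = sum(weights)  # denominator of (3)
            for j in range(M):
                table[j][m][n] = weights[j] / total if total > 0 else 0.0
    return table
```

Only `g` has to be re-estimated when the source or source coder changes;
the channel term depends on `p` alone, which mirrors the separation of
the two sets of transition probabilities noted above.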

4. DECODER ANALYSIS 

In the previous section we developed a scheme for providing error 
correction using the redundancy in the channel input sequence, or the 
source coder output. We looked at the design of the decoder given 
a source coder or channel input sequence with some rather general 
statistical properties. In this section we examine the reverse problem. 
That is, given the decoder obtained in the previous section we look 
for “desired properties" of the channel input sequence and, hence, the 
source coder. 

To obtain the desired properties we need to examine the factors
involved in the error correcting capability of the decoder. Toward
this end let us examine the following situation. Referring to Figure 1,
assume that the correct sequence of transmitted codewords is $a_0 a_0 a_0$.
An error occurs if the path metric for $a_0 a_j a_0$ is greater than the path
metric of $a_0 a_0 a_0$. Assume $\hat{y}_1 = a_n$ and $\hat{y}_2 = a_0$. An error
occurs if the following quantity is positive.

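$$\log \frac{P[\hat{y}_1 = a_n \mid y_1 = a_j]\, P[y_1 = a_j \mid y_0 = a_0]}
{\sum_l P[y_1 = a_l \mid y_0 = a_0]\, P[\hat{y}_1 = a_n \mid y_1 = a_l]}
+ \log \frac{P[\hat{y}_2 = a_0 \mid y_2 = a_0]\, P[y_2 = a_0 \mid y_1 = a_j]}
{\sum_l P[y_2 = a_l \mid y_1 = a_j]\, P[\hat{y}_2 = a_0 \mid y_2 = a_l]}$$

$$- \log \frac{P[\hat{y}_1 = a_n \mid y_1 = a_0]\, P[y_1 = a_0 \mid y_0 = a_0]}
{\sum_l P[y_1 = a_l \mid y_0 = a_0]\, P[\hat{y}_1 = a_n \mid y_1 = a_l]}
- \log \frac{P[\hat{y}_2 = a_0 \mid y_2 = a_0]\, P[y_2 = a_0 \mid y_1 = a_0]}
{\sum_l P[y_2 = a_l \mid y_1 = a_0]\, P[\hat{y}_2 = a_0 \mid y_2 = a_l]} \tag{4}$$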


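Defining

$$d_{ij} = \text{Hamming distance between } a_i \text{ and } a_j, \qquad
g_{lk} = P(y_i = a_l \mid y_{i-1} = a_k) \tag{5}$$

then using (5) and simplifying (4), an error occurs if

$$(d_{nj} - d_{n0}) \log\frac{p}{1-p} + \log\frac{g_{j0}}{g_{00}} + \log\frac{g_{0j}}{g_{00}}
- \log\frac{\sum_l g_{lj}\, \left(\frac{p}{1-p}\right)^{d_{0l}}}{\sum_l g_{l0}\, \left(\frac{p}{1-p}\right)^{d_{0l}}} > 0 \tag{6}$$

where $p$ is the bit error probability of the channel.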
Defining $\alpha = (1-p)/p$, (6) can be rewritten as


Q %0 % V**j i ^ 
The left hand side is maximized when $j = n$ ($d_{nj} = 0$). Thus an error
occurs when the number of bit errors, which in this case is $d_{n0}$, is
greater than the quantity on the RHS of (8) or

The alternative path shown in Figure 1 is only one of the possible paths.
Another longer alternative path is shown in Figure 2.

In this case the number of errors required to take the alternative path
is given by

<i„o > ^..,o + r— f lo g — + lo S — ) 
Notice that the number of errors required to take a longer incorrect
path (a path with more branches) is larger than for a shorter incorrect
path. To make our statements more concrete, we define a parameter
we call the error correction capability $I$ as

$$I = 1 - H(y_n \mid y_{n-1}) / \log M
= 1 - \frac{1}{\log M} \sum_l \sum_k p(y_n = a_l, y_{n-1} = a_k) \log\frac{1}{P(y_n = a_l \mid y_{n-1} = a_k)}$$

$$= 1 - \frac{1}{\log M} \sum_l \sum_k p(y_n = a_l, y_{n-1} = a_k) \log\frac{1}{g_{lk}} \tag{11}$$

where $M$ is the size of the channel input alphabet. We immediately
note the following properties of $I$:

(i) $I$ is a convex cup function of the conditional probabilities

(ii) $0 \le I \le 1$.

Further properties of $I$ are developed in the following lemmas.

Lemma 3: If $I$ is zero for a particular channel input sequence, the
decoder will not correct any errors.

Proof:

$I$ is zero when

$$\sum_l \sum_k p(y_n = a_l, y_{n-1} = a_k) \log\frac{1}{g_{lk}} = \log M.$$

This is true when $g_{lk} = \frac{1}{M}$ for all $l, k$. In this condition the right
hand side of (9) is zero giving the desired result.

□

Lemma 4: If $I$ is one for a particular input sequence, the decoder
obtains the correct sequence with probability one.

Proof:

For $I$ to be one, $H(y_n \mid y_{n-1})$ has to be zero. This is true if for each
$k_0$ there exists an $l_0$ such that

$$g_{l k_0} = \begin{cases} 1, & l = l_0 \\ 0, & l \ne l_0. \end{cases}$$

This in turn implies, through (3), that there exists some $l_0$ such that

$$P[y_i = a_l \mid y_{i-1} = a_{k_0}, \hat{y}_i = a_n] = \begin{cases} 1, & l = l_0 \\ 0, & l \ne l_0. \end{cases}$$

Thus the decoder will pick the correct sequence with probability one.

□

The above two lemmas provide a relationship between the value of
$I$ and the error correcting capability of the decoder, for the extreme
values of $I$. To obtain an insight into the relationship for other values
of $I$ we look at a simplified version of (9). Assume that the size of the
channel input alphabet is two, then (9) simplifies to

$$d_{10} > \frac{1}{\log\alpha} \left( \log\frac{g_{00}}{g_{10}} + \log\frac{g_{00}}{g_{01}}
+ \log\frac{g_{01} + \alpha^{-1} g_{11}}{g_{00} + \alpha^{-1} g_{10}} \right). \tag{12}$$


Noting that $g_{00} = 1 - g_{10}$, $g_{01} = 1 - g_{11}$, and for the right hand side
to be positive $g_{00} > \frac{1}{2}$ and $g_{01} < \frac{1}{2}$, we can rewrite (12) as
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Poo 
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4 log 
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(13) 


In (13) the larger the right hand side the greater is the error correcting
capability of the receiver. The right hand side can be increased by
decreasing $g_{01}$ below $\frac{1}{2}$. Thus the error correcting capability increases
as $g_{01}$ decreases below $\frac{1}{2}$. If we examine $I$ we find that $I$ increases as
$g_{01}$ decreases below $\frac{1}{2}$. This is because

$$H(y_n \mid y_{n-1}) = p_0 \left( g_{00} \log\frac{1}{g_{00}} + g_{10} \log\frac{1}{g_{10}} \right)
+ p_1 \left( g_{01} \log\frac{1}{g_{01}} + (1 - g_{01}) \log\frac{1}{1 - g_{01}} \right) \tag{14}$$

decreases with $g_{01}$ decreasing below $\frac{1}{2}$. Thus for this simple example,
an increase in $I$ means an increase in the error correcting capability of
the decoder.
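
The behavior of I in (11) can be checked numerically with a short sketch
such as the one below. The function name and the argument layout
(`g[l][k]` for the conditional probabilities, `prior[k]` for the
probability of the previous symbol) are assumptions made for this
illustration only.

```python
import math


def error_correction_capability(g, prior):
    """I = 1 - H(y_n | y_{n-1}) / log M, following (11).

    g[l][k]  : P(y_n = a_l | y_{n-1} = a_k)
    prior[k] : P(y_{n-1} = a_k)
    """
    M = len(g)
    cond_entropy = 0.0
    for k in range(M):
        for l in range(M):
            if g[l][k] > 0.0:  # terms with zero probability contribute nothing
                cond_entropy -= prior[k] * g[l][k] * math.log(g[l][k])
    return 1.0 - cond_entropy / math.log(M)
```

With all conditional probabilities equal to 1/M the function returns
zero (the situation of Lemma 3), and with a deterministic transition
matrix it returns one (Lemma 4). For the binary case, holding g_00 fixed
and lowering g_01 below 1/2 raises the returned value, in line with the
discussion of (14).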

5. ENCODER DESIGN 

In the previous section we obtained desirable properties for the
channel input/encoder output sequence. In this section we examine
ways of incorporating these desirable properties into the encoder. We
wish to do this without decreasing the redundancy removal capability
of the source coder, and if possible, without increasing the transmitted
bit rate. To see how to approach this problem let us first examine the
source coder for noiseless channels in some detail.

In general, a source coder consists of two operations, data
compression and data compaction [20]. The data compression operation
usually consists of redundancy removal and involves some loss of
information. Examples of data compression schemes are DPCM, transform
coding and vector quantization. The data compaction schemes are
information preserving. They may result in a variable rate output.
Examples include Huffman coding and runlength coding. Generally in
discussions of joint source/channel coder design, the data compaction
operations are not included. The reason for this is that due to the
variable rate output, the data compaction schemes are highly vulnerable
to channel noise and, therefore, are not considered for noisy channel
applications.

A possible way of achieving our objectives is to insert another op- 
eration between the data compression and data compaction steps as 
shown in Figure 3. 

To satisfy our objectives, the Π operator should have the following
properties.

(a) The Π operator should perform distortionless encoding.

(b) The Π operator should increase the error correcting
capability.

(c) The Π operator should not increase the bit rate. For the
case where the data compaction scheme is a Huffman coder,
this is equivalent to the condition that the output entropy
not be greater than the input entropy.

An example of the Π operator which satisfies (a) and (b) and which
can be modified to satisfy (c) functions as follows. Let the input to Π
be selected from the alphabet

$$A = \{a_0, a_1, \ldots, a_{N-1}\},$$

and let the output alphabet be denoted by

$$S = \{s_0, s_1, \ldots, s_{N^2-1}\}.$$

Then the input/output mapping is given by

$$x_n = a_i,\ x_{n-1} = a_j \ \Rightarrow\ y_n = s_{jN+i}. \tag{15}$$

The effect of the Π operator is to increase the distance between
alternative sequences. To see this, let us construct a simple example.
Let $A = \{a_0, a_1\}$, $S = \{s_0, s_1, s_2, s_3\}$, then

$y_n = s_0$ if $x_n = a_0$ and $x_{n-1} = a_0$,

$y_n = s_1$ if $x_n = a_1$ and $x_{n-1} = a_0$,

$y_n = s_2$ if $x_n = a_0$ and $x_{n-1} = a_1$,

$y_n = s_3$ if $x_n = a_1$ and $x_{n-1} = a_1$.


In this case if $y_n = s_0$, $y_{n+1}$ cannot be $s_2$ or $s_3$ because $y_n = s_0$
means $x_n = a_0$, and $y_{n+1} = s_2$ or $s_3$ means $x_n = a_1$. Thus a channel
input sequence cannot have $s_2$ or $s_3$ following $s_0$.

For simplicity let us ignore the Huffman coder and assign fixed
length codewords to the $s_i$ as

$s_0$: 00, $s_1$: 01, $s_2$: 10, $s_3$: 11

Now suppose the transmitted sequence was the all zero sequence,
the metric used was the Hamming distance, and the received sequence
is 00001000000000; that is, there is an error in the fifth bit. If the
receiver decoded the first four bits as $s_0 s_0$ then it cannot decode the
fifth and sixth bits as $s_2$ for the reason noted above. The only two
options are decoding them as $s_0$ or $s_1$. If we decoded them as $s_0$, we
could continue decoding the rest of the sequence as $s_0 s_0 \ldots$, and the
Hamming distance between the received and decoded sequence would
be one. If we decoded them as $s_1$, we would have to decode the next
set of two bits as $s_2$ or $s_3$ because $s_0$ cannot follow $s_1$. Decoding as
$s_2$ gives the smallest Hamming distance so we decode the seventh and
eighth bit as $s_2$. This gives a total Hamming distance of two for the
incorrect path. Thus the receiver will select the correct path (the path
with the smallest Hamming distance).
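
The mapping (15) and the example above can be reproduced with the
following sketch. It is illustrative only: symbols are represented by
their indices, the symbol preceding the first input is assumed to be
a_0, and the exhaustive search over continuations stands in for a proper
trellis search. All names are hypothetical.

```python
from itertools import product

# Fixed length codewords for s_0..s_3, as assigned in the example.
CODEWORDS = {0: "00", 1: "01", 2: "10", 3: "11"}


def pi_encode(x, N, x_prev=0):
    """Apply (15): x_n = a_i, x_{n-1} = a_j  ->  y_n = s_{jN+i}."""
    out = []
    j = x_prev  # assumed initial previous symbol a_0
    for i in x:
        out.append(j * N + i)
        j = i
    return out


def pi_decode(y, N):
    """Invert the mapping: the index of s modulo N recovers x_n."""
    return [s % N for s in y]


def allowed(prev_s, next_s, N):
    """next_s may follow prev_s only if its x_{n-1} equals prev_s's x_n."""
    return next_s // N == prev_s % N


def best_continuation(received_bits, prefix, N=2):
    """Minimum Hamming distance valid sequence that starts with `prefix`."""
    n_symbols = len(received_bits) // 2
    best = None
    for tail in product(range(N * N), repeat=n_symbols - len(prefix)):
        seq = list(prefix) + list(tail)
        if not all(allowed(a, b, N) for a, b in zip(seq, seq[1:])):
            continue
        bits = "".join(CODEWORDS[s] for s in seq)
        dist = sum(c1 != c2 for c1, c2 in zip(bits, received_bits))
        if best is None or dist < best[0]:
            best = (dist, seq)
    return best
```

Calling `best_continuation("00001000000000", [0, 0])` fixes the first
four bits as s_0 s_0 and searches the remaining symbols; the all zero
sequence is then the unique minimum distance choice, so the single bit
error is corrected.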

6. SIMULATION RESULTS

We present the results of simulating two different systems in this
section. The first set of results were obtained using a nonideal source
coder with the decoder proposed in Section 3. The second set of results
pertain to the system proposed in the previous section. In both cases
the data compression scheme is a DPCM system with a fixed one tap
predictor and a nonuniform Lloyd-Max quantizer.

The source for the first set of results is the USC GIRL image.
The source coder output transition probabilities were obtained using
a training set. The training image was the USC COUPLE image.
The performance measure was the peak signal-to-noise ratio (PSNR)
defined as

psNH[dB) = ioio g|0 

where $x_i$ is the input to the source coder while $\hat{x}_i$ is the output of the
source decoder. Figure 4 shows the performance comparison for a two
bit per pixel system. Most of the performance improvement is available
at high probabilities of error. At these probabilities of error, however,
the improvement is substantial. Figure 5 shows the same kind of results
for a four bit per pixel system. The performance improvements for this
case are even more substantial than those for the two-bit system. Two
things are especially noteworthy in these results. The first one is that
the performance improvement does not really become significant until
the channel is very noisy. The other is that the performance curve
in the high noise region is relatively flat. This means that even very
noisy channels may be usable for image transmission. Further results
including perceptual results can be found in [16].
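
For readers who wish to reproduce this kind of comparison, a PSNR
computation can be sketched as below. The peak value of 255 (8 bit
pixels) and the use of the mean squared error in the denominator are
assumptions of this sketch, since the exact normalization used in the
experiments is not legible in the text above; the function name is
likewise hypothetical.

```python
import math


def psnr_db(original, reconstructed, peak=255.0):
    """PSNR in dB between source coder input x_i and decoder output xhat_i.

    `peak` is an assumed peak signal value (255 for 8 bit images).
    """
    mse = sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)
```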

The second set of results were obtained using the approach
proposed in Section 5. The source encoder was replaced by the proposed
joint source/channel coder. The Π operator used is the one described
in the previous section. The source again was the USC GIRL image,
and end of line synchronization was assumed. The performance
comparison is shown in Figure 6. Note that unlike the previous case, the
performance improvement occurs at both low and high error
probabilities. This makes the scheme especially useful for transmission at
low error rates.

7. CONCLUSIONS

In this paper we have presented a MAP approach to joint source/
channel coder design. The approach is based in part on the fact that
source coders are, in general, nonideal and, therefore, cannot remove
all redundancy from a source. This nonideality is taken advantage
of, by a MAP decoder, to correct errors. The decoder is analyzed to
obtain desired properties for the encoder output sequence. A joint
source/channel encoder design approach is presented which incorporates
the desired properties, and examples are given which show that
considerable performance improvements can be obtained with the
proposed approach.

Figure 1. Alternative Paths at the Receiver

Figure 2. Longer alternative paths

Figure 6. Performance Results of Joint Source/Channel Coding Scheme


Appendix 2- Item 3 



IMPLEMENTATION ISSUES IN MAP JOINT 
SOURCE/CHANNEL CODING 

Khalid Sayood*, Jerry D. Gibson® and Fuling Liu*


*Department of Electrical Engineering
University of Nebraska 
Lincoln, NE 68588-0511 


ABSTRACT 

One of Shannon’s many fundamental contributions was his result 
that source coding and channel coding can be implemented 
separately without any loss of optimality. However, the assump- 
tion underlying this result may at times be violated in practice. 
Various joint source/channel coding approaches have been
developed for handling such situations. A MAP approach to joint
source/channel coding has been proposed which uses a MAP
decoder and a modification of the source coder to provide error
correction. We present various implementation strategies for this
approach and provide results for an image coding application.


1. Introduction 

One of Shannon’s many fundamental contributions was his result 
that source coding and channel coding can be treated separately 
without any loss of performance as compared to an optimum
system [1]. The basic design procedure implied by Shannon's
theorems consists of designing a source encoder which changes the
source sequence into a series of (approximately) independent,
equally likely binary digits followed by a channel encoder which
accepts binary digits and puts them into a form suitable for
reliable transmission over the channel [2]. One aspect of the
overall optimum system not addressed by Shannon is any increase
in system complexity that results from this separation, and Massey
[3] and Ancheta [4] showed that for distortionless transmission of
the source under the constraint of linear source and channel
coders, a significant reduction in complexity with equivalent
performance can be achieved by using a linear joint source/
channel coder. Their scheme also differs from most data
compression systems in that the bulk of the system complexity is
transferred to the receiver.

The theorem that provides the justification for the separate design 
of the source coder and the channel coder, often called the 
Information Transmission Theorem [2], assumes that both the 
source encoder /decoder pair and the channel encoder/decoder pair 
are operating in an optimal fashion. Specifically, the source 
encoder is assumed to present the channel encoder with a sequence 
suitable for optimal channel coding, and the channel encoder/ 
decoder pair is assumed to reproduce the source encoder output at 
the source decoder input with negligible distortion. Unfortu- 
nately, there are practical situations where these assumptions are 
violated-- namely, when the source encoder output contains 
redundancy, which occurs if the source encoder is suboptimal, and 
when the source decoder input differs from the source encoder 
output, which is a result of channel errors. These two situations 
are common occurrences in practical communication systems 
where source and/or channel models are imperfectly known, 
complexity is a serious issue, or significant delay is not tolerable.
Various approaches have been developed to handle these two
situations. These include approaches in which the source and
channel coding operations are truly integrated [3-6], approaches
that cascade known source coders with known channel coders and
allocate the fixed bit rate to the source coder and channel coder
to maximize system performance [7-15], approaches in which the
source coder and/or receiver is modified to account for the
presence of a given noisy channel [16-26], and approaches which
use some knowledge of the source and source coder properties to
detect channel errors and compensate for their effects [27-35].
The research described in this paper is concerned with the 
implementation of a joint source/channel coder design which was 
an extension of the work presented in [31,32]. This approach 
utilizes structure in the source encoder output by using a MAP 
decoder to correct errors introduced by the channel. 

II. Previous Work

Based on the MAP design criteria, a decoder structure was
proposed in [32] which takes advantage of redundancy in the
channel input sequence to provide error correction. The decoder
maximizes the quantity $\log P(Y \mid \hat{Y})$ where

$$Y = (y_1, y_2, \ldots, y_L)$$

is the channel input sequence while

$$\hat{Y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_L)$$

is the channel output sequence. If a Markov model is imposed on
the channel input sequence, the path metric can be written as

$$\log P(Y \mid \hat{Y}) = \sum_i \log P(y_i \mid \hat{y}_i, y_{i-1}) \tag{1}$$

and

$$P(y_i = a_j \mid y_{i-1} = a_m, \hat{y}_i = a_n) =
\frac{P(\hat{y}_i = a_n \mid y_i = a_j)\, P(y_i = a_j \mid y_{i-1} = a_m)}
{\sum_l P(\hat{y}_i = a_n \mid y_i = a_l)\, P(y_i = a_l \mid y_{i-1} = a_m)} \tag{2}$$

The proof of the above can be found in [31,32].

Based on analysis of the decoder a parameter called the error 
correction capability was defined in [35] as 

$$I = 1 - H(y_n \mid y_{n-1}) / \log M \tag{3}$$

We noted that a desirable property of a joint source channel coder 
would be to increase /. The approach proposed for this requires 
the modification of the source coder. In general, a source coder 
consists of two operations, data compression and data compaction 
[36]. The data compression operation usually consists of redun- 
dancy removal and involves some loss of information. Examples 
of data compression schemes are DPCM, transform coding and 
vector quantization. The data compaction schemes are information
preserving. They may result in a variable rate output. Examples
include Huffman coding and runlength coding. Generally in 



discussions of joint source/channel coder design, the data
compaction operations are not included. The reason for this is 
that due to the variable rate output, the data compaction schemes 
are highly vulnerable to channel noise and, therefore, are not 
considered for noisy channel applications. 


where $d_{nj}$ is the Hamming distance between the binary codewords
corresponding to $a_n$ and $a_j$ and $n$ is the number of bits in each of
the codewords. However, when $l(a_n) \ne l(a_j)$ the calculation may
not be as simple. To see this we need to introduce some more
notation. Let the codeword corresponding to $a_n$ be represented by


A possible approach to achieving the objective of increasing / is 
to insert an invertible transoperation between the data compres- 
sion and data compaction stages. An example of such an opera- 
tion called the tt operator, was presented in [35]. The operation 
can be described as follows. Let the input to the n operator be 
selected from the alphabet 

$$S = \{s_0, s_1, s_2, \ldots, s_{N-1}\}$$

and let the output alphabet be denoted by

$$A = \{a_0, a_1, \ldots, a_{M-1}\}.$$


Then, if $l(a_n)$ is less than $l(a_j)$

V- n ;!; n> ^ fl n k l \> (7) 

and the calculation is still relatively straightforward. However, if
$l(a_n)$ is greater than $l(a_j)$,

>‘(y; = « n I r, = f'j) = /Yh = w, = ", I r, = (S) 

or in more familiar terms 


Then the input/output mapping is given by 


p (y i = n n I - v i = V ' E l 


P(V; 


,1.0 


(9) 


This operator and its effects are described in more detail in [35]. 

While this approach achieves the objective of increasing the error
correcting capability, it also results in a variable rate system. For
this situation the branch metric of the form of (2) becomes
difficult to implement. We explain these difficulties and propose
implementable approximations of the metrics in the next section.
Section V contains simulation results which demonstrate the
viability of these approximations. The use of a variable rate coder
also complicates the structure of the decoder. In Section IV we
present a modified Viterbi decoder which can be used with

III. Development of the Path Metric

Before we begin our discussion of the path metric for variable rate 
case we need to summarize the derivation of (2). The derivation 
consists of two steps. First we show that 

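$$P(y_i = a_j \mid y_{i-1} = a_m, \hat{y}_i = a_n) =
\frac{P(\hat{y}_i = a_n \mid y_i = a_j)\, P(y_i = a_j \mid y_{i-1} = a_m)}
{P(\hat{y}_i = a_n \mid y_{i-1} = a_m)} \tag{4}$$

Then we show that the denominator can be written as

$$P(\hat{y}_i = a_n \mid y_{i-1} = a_m) =
\sum_l P(\hat{y}_i = a_n \mid y_i = a_l)\, P(y_i = a_l \mid y_{i-1} = a_m) \tag{5}$$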

Note first that in this derivation the channel input alphabet and
output alphabet are the same. We have assumed hard decision at
the output of the channel and for a fixed rate coder this translates
into identical alphabets at the input and output of the channel.
For the case where we have variable rate codes there is a subtle
difference. In fact, there are two different ways in which we can
view the output of the channel. The first approach is to assume
that there is a Huffman decoder at the output of the channel. The
Huffman decoder output alphabet is the same as the joint source/
channel (JSC) coder output alphabet. Thus the branch metric as
derived in (2) can be used directly. However, now the
computation of the individual factors of the branch metric becomes
somewhat more involved. Specifically, consider the calculation of
$P(\hat{y}_i = a_n \mid y_i = a_j)$, where the channel is assumed to be a binary
symmetric channel with known crossover probability $p$. Let $l(a_k)$
be the number of bits in the binary codeword corresponding to the
symbol $a_k$.

If $l(a_n) = l(a_j)$, as is the case when a fixed rate code is used, then

$$P(\hat{y}_i = a_n \mid y_i = a_j) = p^{d_{nj}} (1 - p)^{n - d_{nj}} \tag{6}$$


= ‘Wi = "i-W.iv, = ", I r, = v 

where we have used the chain rule and the Markov property of the
JSC coder output. The second factor in the summand is simply
the transition probability of the JSC coder output while the first
factor can be calculated as


P(>\ * "r. I -*'i ' = v 


U^ !! i ) Pr(a r I a. J III Ua n > Pr(c n | «, 
k-1 r k k s WaO + l k k- 


l(a.) 


as long as $l(a_n)$ is less than or equal to $l(a_j) + l(a_l)$. If not we

simply repeat the process again to obtain


p (y i “ "n I - V i = = V = S h p(y \ * V V i *2 

" "j-'Vl = V = >: h p (>’x = "n ! ''i “ V’i‘1 
= a h )P(y it2 = "hi >V, » V 


"h I - v i 

r 


( 10 ) 


Again $l(a_n)$ should be less than $l(a_j) + l(a_l) + l(a_h)$.

Obviously this process can continue if there is a large variation in 
the codeword lengths. Therefore, this approach becomes cumber- 
some for moderately large codebooks. 


A somewhat different way of looking at this issue, suggested in a
slightly different context by Massey [37], is to block the channel
output bit stream into fixed length words where the fixed length
is longer than the longest binary codeword in the channel input.
Then, the path metric becomes the logarithm of

$$\frac{P(\hat{y}_i = r \mid y_i)\, P(y_i \mid y_{i-1})}
{\sum_l P(\hat{y}_i = r \mid y_i = a_l)\, P(y_i = a_l \mid y_{i-1})} \tag{11}$$

where $r$ denotes the word corresponding to a received block of
bits. While there are some complications here as well, in the
interpretation of $P(\hat{y}_i \mid y_i)$, the main difficulty is a computational
one. The simplest implementation of the JSC decoder requires
that the path metrics be stored in a lookup table. In the case of
identical input and output alphabets of size $M$, the lookup table
size is $M^3$. However, with this approach, the lookup table size is
$M^2\, 2^{L+1}$ where $L$ is the longest codeword. This exponential
increase with even moderate codeword lengths makes this
approach impractical at least for a lookup table implementation.
An implementation which does not use a lookup table, and instead 
computes the path metric at each step may still be possible with 
special purpose dedicated hardware. 



Given the difficulties involved with implementation of the exact
path metric of the MAP JSC decoder, we have proposed two
approximations which provide a high level of error protection
while being computationally simple and easy to implement. First
consider (4). We approximate the denominator as

$$P(\hat{y}_i = r \mid y_{i-1} = a_m) \approx P(\hat{y}_i = r)$$

and therefore the entire expression as 

$$P(y_i = a_j \mid \hat{y}_i = r, y_{i-1} = a_m) \approx \frac{P(\hat{y}_i = r \mid y_i = a_j) \, P(y_i = a_j \mid y_{i-1} = a_m)}{P(\hat{y}_i = r)} \tag{12}$$

where the number of bits in $r$ is the number of bits used to
represent $a_j$. The denominator is further approximated by
assuming equally likely reception of bits as

$$P(\hat{y}_i = r) = \left(\frac{1}{2}\right)^{l(a_j)} \tag{13}$$

where $l(a_j)$ is the number of bits in $a_j$ and therefore in $r$. The
computation of the path metric then proceeds as follows: the
conditional probability $P(y_i = a_j \mid y_{i-1} = a_m)$ is read from a
lookup table and the channel transition probability
$P(\hat{y}_i = r \mid y_i = a_j)$ is computed by
assuming a binary symmetric channel with known crossover
probability. This form of the path metric is easy to implement
and the simulation results of Section V show the scheme to be 
highly effective. 
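
As a rough illustration of how the branch metric of (12) and (13) might be evaluated, the sketch below takes the logarithm of the approximated expression for one branch. The function names `bsc_likelihood` and `approx1_branch_metric`, and the way the source transition probability is passed in as a plain number (standing in for the lookup table), are assumptions made for this example only.

```python
import math


def bsc_likelihood(received, codeword, p):
    """P(received | codeword) over a binary symmetric channel with
    crossover probability p; both bit strings have the same length."""
    d = sum(r != c for r, c in zip(received, codeword))
    return p ** d * (1 - p) ** (len(codeword) - d)


def approx1_branch_metric(received, codeword, p_trans, p):
    """Log of the approximation in (12) with the denominator of (13).

    received : the l(a_j) received bits assigned to this branch
    codeword : binary codeword of the candidate symbol a_j
    p_trans  : P(y_i = a_j | y_{i-1} = a_m), e.g. read from a lookup table
    p        : channel crossover probability
    """
    numerator = bsc_likelihood(received, codeword, p) * p_trans
    if numerator == 0.0:
        return float("-inf")  # disallowed transition or impossible reception
    denominator = 0.5 ** len(codeword)  # equally likely received bits
    return math.log(numerator / denominator)
```

For instance, with a two-bit codeword received without error, a transition probability of 0.5, and a crossover probability of 0.1, the metric is log(0.81 x 0.5 / 0.25) = log(1.62). The path metric would then be the sum of such branch metrics along a candidate path.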

An even simpler approximation is to use the Hamming distance 
between the received bits and the candidate sequence elements as 
the branch and path metric. Of course the candidate sequence
elements are selected from allowed sequence values. (Recall that
the $\pi$ operator, by construction, disallows certain sequences.) We
present results using this metric in Section V. This approximation
causes a drop in performance ranging from about half a dB in the low
noise region to about 1.5 to 2 dB in the high noise region. Given
the simplicity of implementation for this scheme, this may very 
well be an acceptable cost. 

Once the path metric has been obtained, the decoder structure
needs to be elucidated. We do so in the next section.

IV. Decoder Structure

The form of the path metric in (1) is a familiar one and several
decoder structures exist which maximize (or minimize) additive
path metrics of this form. One of the most popular ones is the
Viterbi decoder structure. Recall that the Viterbi decoder limits
the total number of candidate paths (solutions) to some finite
number $M$ where $M$ is the number of different values a solution
can take at any given time increment. This is done by using a
trellis structure that only includes allowed paths or transitions.
For the problem considered here $M$ would be the size of the
output alphabet of the $\pi$ operator. In most applications where the
Viterbi decoder is used, the codewords are of fixed length and
therefore the candidate paths are of the same length. This is not
true in the current case. However, this problem can be resolved
rather simply by associating a pointer with each candidate path.
The pointer counts the number of bits used to form the path it is
associated with.

To see how this works consider the following example. Let the
input alphabet to the $\pi$ operator be of size two; $S = \{s_0, s_1\}$.
Suppose the input sequence to the $\pi$ operator is

$s_0 \; s_0 \; s_0 \; s_1 \; s_0 \; s_0$

then the output of the $\pi$ operator will be

$a_0 \; a_0 \; a_1 \; a_2 \; a_0$

If the Huffman code for the $\pi$ operator output is
$a_0$: 0, $a_1$: 10, $a_2$: 110, $a_3$: 111
then the transmitted binary sequence will be
0 0 10 110 0
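
The encoding side of this example can be reproduced with a few lines of code. The sketch assumes that the operator maps each pair of consecutive inputs to the output index N * (previous index) + (current index), with N = 2; this mapping is inferred from the example sequences and is not quoted from the text. The names `pi_operator` and `encode` are hypothetical.

```python
N = 2  # size of the input alphabet {s_0, s_1}

# Huffman code of the example: output symbol index -> binary codeword
HUFFMAN = {0: "0", 1: "10", 2: "110", 3: "111"}


def pi_operator(x):
    """Map source indices x_0, x_1, ... to output indices, one per pair of
    consecutive inputs (assumed mapping: N * previous + current)."""
    return [N * prev + cur for prev, cur in zip(x, x[1:])]


def encode(symbols):
    """Concatenate the Huffman codewords of the output symbols."""
    return "".join(HUFFMAN[k] for k in symbols)


source = [0, 0, 0, 1, 0, 0]   # s_0 s_0 s_0 s_1 s_0 s_0
outputs = pi_operator(source)  # a_0 a_0 a_1 a_2 a_0
bits = encode(outputs)         # transmitted bit string
```

Under this mapping a symbol with an even index can only be followed by $a_0$ or $a_1$, and one with an odd index only by $a_2$ or $a_3$, which is the kind of disallowed-sequence structure the decoder exploits.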

Suppose there is an error in the fourth bit and the received 
sequence is 

00111100 

The decoder operation is shown in Figure 1, where the metric 
being used is the Hamming distance. The branches are labelled 
with a pair of numbers. The first number is the accumulated 
number of bits used by the path that includes that branch while 
the second number is the Hamming distance between the received 
bits and the candidate solution. The receiver assumes a starting 
value of $a_0$. In the first step there are two possibilities, that the
transmitted word was $a_0$ or $a_1$. If we assume the transmitted word
was $a_0$ we use up one bit and the Hamming distance is zero. If $a_1$
is assumed then we use two bits and the Hamming distance is one.
Therefore, the lower branch (to $a_0$) is labelled 1,0 while the
branch to $a_1$ is labelled 2,1. This procedure is continued with
conflicts being resolved by picking the path with the lower
Hamming distance. The procedure is shown in Figure 1.
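
The example can also be checked numerically. The sketch below is not the symbol-step trellis of Figure 1; it is a simplified variant that keeps one survivor for every pair (bits consumed, last symbol), so that the bit count plays the role of the pointer and only paths that have used the same number of bits are compared. The allowed transitions assume the index mapping N * (previous) + (current) inferred from the example, and the name `decode` is hypothetical.

```python
N = 2
CODE = {0: "0", 1: "10", 2: "110", 3: "111"}  # a_0 .. a_3


def allowed_next(k):
    """Symbols that may follow a_k under the assumed mapping."""
    cur = k % N
    return [N * cur + x for x in range(N)]


def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))


def decode(received, start=0):
    """Return (distance, symbols) of the allowed path that consumes every
    received bit with the smallest Hamming distance, or None if no
    allowed path fits the received length."""
    total = len(received)
    # best[(pos, k)] = (accumulated distance, decoded symbols so far)
    best = {(0, start): (0, [])}
    for pos in range(total):
        for k in CODE:
            if (pos, k) not in best:
                continue
            dist, path = best[(pos, k)]
            for nxt in allowed_next(k):
                word = CODE[nxt]
                end = pos + len(word)
                if end > total:
                    continue  # codeword would run past the received bits
                cand = dist + hamming(word, received[pos:end])
                key = (end, nxt)
                if key not in best or cand < best[key][0]:
                    best[key] = (cand, path + [nxt])
    finals = [best[(total, k)] for k in CODE if (total, k) in best]
    return min(finals, key=lambda t: t[0]) if finals else None


distance, symbols = decode("00111100")
```

For the received sequence 00111100 this recovers $a_0 \, a_0 \, a_1 \, a_2 \, a_0$ at Hamming distance one, so the single bit error is corrected. Errors that map one allowed sequence onto another allowed sequence cannot be corrected this way.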

V. Simulation Results


The techniques presented in this paper were applied to an image
coding scheme. The data compression scheme was a DPCM
system with a fixed four level nonuniform Max quantizer and a
one-tap predictor. The data compaction scheme is a sixteen-level
Huffman coder. The average rate for this system was 2.3 bits per
pixel. End of line resynchronization is assumed for the receiver.
A block diagram of the system is shown in Figure 2.

The performance with both metrics is shown in Figure 3 and
Figure 4. Both figures plot the same results where Figure 3
emphasizes the performance in the low noise region and Figure 4
emphasizes performance at high channel error rates. The curves
are labeled "Approx 1," "Approx 2," and "No Protection." The
curve labeled Approx 1 is the performance curve for the system
which uses the metric approximation of (12) and (13). The curve
labeled Approx 2 is the system which uses the second approximation,
i.e., the Hamming distance between the received bits and the
candidate sequence elements. The curve labeled "No Protection"
is the system without the joint source/channel coding scheme.
Both metric approximations provide a high degree of protection
for low to moderate channel error rates. At high channel error
rates, while both the systems provide substantial performance
improvements over the unprotected system, the system with the
Hamming distance metric provides lower performance than the
system with the approximation of (12) and (13). However, as
mentioned before, this might be a small cost to pay for the
simplicity of implementation.
Figure 2. Proposed system



Figure 4. Comparison of performance with 